Picture for Han Cai

Han Cai

JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search

Add code
May 26, 2026
Viaarxiv icon

Hide to Guide: Learning via Semantic Masking

Add code
May 24, 2026
Viaarxiv icon

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Add code
May 13, 2026
Viaarxiv icon

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Add code
Feb 03, 2026
Viaarxiv icon

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Add code
Jan 20, 2026
Viaarxiv icon

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Add code
May 26, 2025
Viaarxiv icon

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Add code
May 24, 2025
Viaarxiv icon

Scaling Vision Pre-Training to 4K Resolution

Add code
Mar 25, 2025
Viaarxiv icon

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Add code
Feb 03, 2025
Figure 1 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 2 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 3 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Figure 4 for Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Viaarxiv icon

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

Add code
Jan 30, 2025
Figure 1 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 2 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 3 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Figure 4 for SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Viaarxiv icon